Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
The ability to automatically assess learners' activities is the key to user modeling and personalization in adaptive educational systems.The work presented in this paper opens an opportunity to expand the scope of automated assessment from traditional programming problems to code comprehension tasks where students are requested to explain the critical steps of a program. The ability to automatically assess these self-explanations offers a unique opportunity to understand the current state of student knowledge, recognize possible misconceptions, and provide feedback. Annotated datasets are needed to train Artificial Intelligence/Machine Learning approaches for the automated assessment of student explanations. To answer this need, we present a novel corpus called SelfCode which consists of 1,770 sentence pairs of student and expert self-explanations of Java code examples, along with semantic similarity judgments provided by experts. We also present a baseline automated assessment model that relies on textual features. The corpus is available at the GitHub repository (https://github.com/jeevanchaps/SelfCode).more » « less
-
Domain modeling is a central component in education technologies as it represents the target domain students are supposed to train on and eventually master. Automatically generating domain models can lead to substantial cost and scalability benefits. Automatically extracting key concepts or knowledge components from, for instance, textbooks can enable the development of automatic or semi-automatic processes for creating domain models. We explore in this work the use of transformer based pre-trained models for the task of keyphrase extraction. Specifically, we investigate and evaluate four different variants of BERT, a pre-trained transformer based architecture, that vary in terms of training data, training objective, or training strategy to extract knowledge components from textbooks for the domain of intro-to-programming. We report results obtained using the following BERT-based models: BERT, CodeBERT, SciBERT and RoBERTa.more » « less
-
Crossley, Scott; Popescu, Elvira (Ed.)We present here a novel instructional resource, called DeepCode, to support deep code comprehension and learning in intro-to-programming courses (CS1 and CS2). DeepCode is a set of instructional code examples which we call a codeset and which was annotated by our team with comments (e.g., explaining the logical steps of the underlying problem being solved) and related instructional questions that can play the role of hints meant to help learners think about and articulate explanations of the code. While DeepCode was designed primarily to serve our larger efforts of developing an intelligent tutoring system (ITS) that fosters the monitoring, assessment, and development of code comprehension skills for students learning to program, the codeset can be used for other purposes such as assessment, problem-solving, and in various other learning activities such as studying worked-out code examples with explanations and code visualizations. We present here the underlying principles, theories, and frameworks behind our design process, the annotation guidelines, and summarize the resulting codeset of 98 annotated Java code examples which include 7,157 lines of code (including comments), 260 logical steps, 260 logical step details, 408 statement level comments, and 590 scaffolding questions.more » « less
-
null (Ed.)The transfer learning pretraining-finetuning paradigm has revolutionized the natural language processing field yielding state-of the art results in several subfields such as text classification and question answering. However, little work has been done investigating pretrained language models for the open student answer assessment task. In this paper, we fine tune pretrained T5, BERT, RoBERTa, DistilBERT, ALBERT and XLNet models on the DT-Grade dataset which contains freely generated (or open) student answers together with judgment of their correctness. The experimental results demonstrated the effectiveness of these models based on the transfer learning pretraining-finetuning paradigm for open student answer assessment. An improvement of 8%-15% in accuracy was obtained over previous methods. Particularly, a T5 based method led to state-of-the-art results with an accuracy and F1 score of 0.88.more » « less
-
We present in this paper an automated method to assess the quality of Jupyter notebooks. The quality of notebooks is assessed in terms of reproducibility and executability. Specifically, we automatically extract a number of expert-defined features for each notebook, perform a feature selection step, and then trained supervised binary classifiers to predict whether a notebook is reproducible and executable, respectively. We also experimented with semantic code embeddings to capture the notebooks' semantics. We have evaluated these methods on a dataset of 306,539 notebooks and achieved an F1 score of 0.87 for reproducibility and 0.96 for executability (using expert-defined features) and an F1 score of 0.81 for reproducibility and 0.78 for executability (using code embeddings). Our results suggest that semantic code embeddings can be used to determine with good performance the reproducibility and executability of Jupyter notebooks, and since they can be automatically derived, they have the advantage of no need for expert involvement to define features.more » « less
-
null (Ed.)We present in this paper the results of a randomized control trial experiment that compared the effectiveness of two instructional strategies that scaffold learners' code comprehension processes: eliciting Free Self-Explanation and a Socratic Method. Code comprehension, i.e., understanding source code, is a critical skill for both learners and professionals. Improving learners' code comprehension skills should result in improved learning which in turn should help with retention in intro-to-programming courses which are notorious for suffering from very high attrition rates due to the complexity of programming topics. To this end, the reported experiment is meant to explore the effectiveness of various strategies to elicit self-explanation as a way to improve comprehension and learning during complex code comprehension and learning activities in intro-to-programming courses. The experiment showed pre-/post-test learning gains of 30% (M = 0.30, SD = 0.47) for the Free Self-Explanation condition and learning gains of 59% (M = 0.59,SD = 0.39) for the Socratic method. Furthermore, we investigated the behavior of the two strategies as a function of students' prior knowledge which was measured using learners' pretest score. For the Free Self-Explanation condition, there was no significant difference in mean learning gains for low vs. high knowledge students. The magnitude of the difference in performance (mean difference= 0.02,95% CI: -0.34 to 0.39) was very small (eta squared = 0.006). Likewise, the Socratic method showed no significant difference in mean learning gains between low vs. high performing students. The magnitude of the performance difference (mean difference =-0.24,95% CI: -0.534 to 0.03) was large (eta squared = 0.10). These findings suggest that eliciting self-explanations can be used as an effective strategy and that guided self-explanations as in the Socratic method condition is more effective at inducing learning gains.more » « less
-
Computer Science (CS) education is critical in todays world, and introductory programming courses are considered extremely difficult and frustrating, often considered a major stumbling block for students willing to pursue computer programming related careers. In this paper, we describe the design of Socratic Tutor, an Intelligent Tutoring System that can help novice programmers to better understand programming concepts. The system was inspired by the Socratic method of teaching in which the main goal is to ask a set of guiding questions about key concepts and major steps or segments of complete code examples. To evaluate the Socratic Tutor, we conducted a pilot study with 34 computer science students and the results are promising in terms of learning gains.more » « less
-
This paper reports the findings of an empirical study on the effects and nature of self explanation during source code comprehension learning activities in the context of learning computer programming language Java. Our study shows that self explanation helps learning and there is a strong positive correlation between the volume of self-explanation students produce and how much they learn. Furthermore, selfexplanations as an instructional strategy has no discrepancy based on student’s prior knowledge. We found that participants explain target code examples using a combination of language, code references, and mathematical expressions. This is not surprising given the nature of the target item, a computer program, but this indicates that automatically evaluating such self-explanations may require novel techniques compared to self-explanations of narrative or scientific texts.more » « less
An official website of the United States government

Full Text Available